Overview

Dataset statistics

Number of variables: 22
Number of observations: 663
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 77.6 KiB
Average record size in memory: 119.8 B

Variable types

NUM: 13
CAT: 6
BOOL: 3

Warnings

Disciplinary failure has constant value "False" in all 663 rows (Constant)
Education is highly correlated with ID (High correlation)
ID is highly correlated with Education (High correlation)
df_index has unique values (Unique)
Children has 261 (39.4%) zeros (Zeros)
Pet has 404 (60.9%) zeros (Zeros)
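
The Zeros, Constant and Unique flags can be reproduced with simple counting. The following is a minimal sketch, assuming plain Python lists; the helper names are hypothetical, and it does not show whatever thresholds the report software applies before raising a warning. The Children counts are copied from that variable's value table in this report.

```python
def zeros_summary(values):
    """Return (number of zeros, share of zeros in percent)."""
    n_zeros = sum(1 for v in values if v == 0)
    return n_zeros, 100 * n_zeros / len(values)


def is_constant(values):
    """True when every observation holds the same value."""
    return len(set(values)) == 1


def is_unique(values):
    """True when no value occurs more than once."""
    return len(set(values)) == len(values)


# Children column rebuilt from the value counts listed in this report.
children = [0] * 261 + [1] * 204 + [2] * 146 + [3] * 13 + [4] * 39
n_zeros, pct = zeros_summary(children)
print(f"Children has {n_zeros} ({pct:.1f}%) zeros")

# Disciplinary failure is False in all 663 rows, so it is constant.
print(is_constant([False] * 663))
```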

Reproduction

Analysis started: 2020-11-11 16:06:06.909984
Analysis finished: 2020-11-11 16:06:40.203339
Duration: 33.29 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 663
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 367.7149321
Minimum: 0
Maximum: 736
Zeros: 1
Zeros (%): 0.2%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 35.1
Q1: 181.5
Median: 369
Q3: 551
95-th percentile: 700.7
Maximum: 736
Range: 736
Interquartile range (IQR): 369.5

Descriptive statistics

Standard deviation: 213.0937382
Coefficient of variation (CV): 0.5795079819
Kurtosis: -1.205649484
Mean: 367.7149321
Median Absolute Deviation (MAD): 185
Skewness: -0.003989991574
Sum: 243795
Variance: 45408.94127
Monotonicity: Strictly increasing
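
Several of the df_index figures are derived from the others: the mean is the sum over the number of observations, the IQR is Q3 minus Q1, the CV is the standard deviation over the mean, and the variance is the squared standard deviation. The sketch below recomputes them from values copied out of this report; the variable names are illustrative only.

```python
n = 663                    # Number of observations
total = 243795             # Sum
q1, q3 = 181.5, 551        # Q1 and Q3
minimum, maximum = 0, 736  # Minimum and Maximum
std = 213.0937382          # Standard deviation

mean = total / n                 # arithmetic mean
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance

print(mean, iqr, value_range, cv, variance)
```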
Histogram with fixed size bins (bins=50)
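Common values

Value | Count | Frequency (%)
736 | 1 | 0.2%
247 | 1 | 0.2%
245 | 1 | 0.2%
244 | 1 | 0.2%
243 | 1 | 0.2%
242 | 1 | 0.2%
241 | 1 | 0.2%
240 | 1 | 0.2%
239 | 1 | 0.2%
238 | 1 | 0.2%
237 | 1 | 0.2%
236 | 1 | 0.2%
235 | 1 | 0.2%
234 | 1 | 0.2%
233 | 1 | 0.2%
232 | 1 | 0.2%
231 | 1 | 0.2%
230 | 1 | 0.2%
229 | 1 | 0.2%
246 | 1 | 0.2%
248 | 1 | 0.2%
269 | 1 | 0.2%
249 | 1 | 0.2%
267 | 1 | 0.2%
266 | 1 | 0.2%
Other values (638) | 638 | 96.2%

Smallest values

Value | Count | Frequency (%)
0 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Largest values

Value | Count | Frequency (%)
736 | 1 | 0.2%
735 | 1 | 0.2%
734 | 1 | 0.2%
733 | 1 | 0.2%
732 | 1 | 0.2%
731 | 1 | 0.2%
730 | 1 | 0.2%
729 | 1 | 0.2%
728 | 1 | 0.2%
727 | 1 | 0.2%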

ID
Categorical

HIGH CORRELATION

Distinct: 33
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value | Count | Frequency (%)
3 | 96 | 14.5%
28 | 73 | 11.0%
34 | 50 | 7.5%
22 | 41 | 6.2%
20 | 39 | 5.9%
11 | 38 | 5.7%
15 | 34 | 5.1%
14 | 29 | 4.4%
36 | 28 | 4.2%
24 | 27 | 4.1%
33 | 23 | 3.5%
10 | 23 | 3.5%
1 | 22 | 3.3%
17 | 20 | 3.0%
18 | 14 | 2.1%
5 | 13 | 2.0%
13 | 13 | 2.0%
25 | 10 | 1.5%
6 | 8 | 1.2%
9 | 8 | 1.2%
12 | 7 | 1.1%
30 | 6 | 0.9%
27 | 6 | 0.9%
32 | 5 | 0.8%
23 | 5 | 0.8%
Other values (8) | 25 | 3.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 1.766214178
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 10
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 269 | 23.0%
3 | 251 | 21.4%
1 | 246 | 21.0%
4 | 106 | 9.1%
8 | 87 | 7.4%
0 | 68 | 5.8%
5 | 57 | 4.9%
6 | 43 | 3.7%
7 | 30 | 2.6%
9 | 14 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1171 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1171 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1171 | 100.0%

Reason for absence
Categorical

Distinct: 27
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Value | Count | Frequency (%)
23 | 142 | 21.4%
28 | 108 | 16.3%
13 | 55 | 8.3%
27 | 47 | 7.1%
19 | 40 | 6.0%
22 | 37 | 5.6%
26 | 33 | 5.0%
25 | 31 | 4.7%
11 | 26 | 3.9%
10 | 25 | 3.8%
18 | 21 | 3.2%
14 | 19 | 2.9%
1 | 16 | 2.4%
7 | 15 | 2.3%
12 | 8 | 1.2%
6 | 8 | 1.2%
21 | 6 | 0.9%
8 | 6 | 0.9%
9 | 4 | 0.6%
16 | 3 | 0.5%
24 | 3 | 0.5%
5 | 3 | 0.5%
15 | 2 | 0.3%
4 | 2 | 0.3%
17 | 1 | 0.2%
Other values (2) | 2 | 0.3%

Frequencies of value counts

Unique

Unique: 3
Unique (%): 0.5%
Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 1.915535445
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 10
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 453 | 35.7%
1 | 248 | 19.5%
3 | 198 | 15.6%
8 | 135 | 10.6%
7 | 63 | 5.0%
6 | 44 | 3.5%
9 | 44 | 3.5%
5 | 36 | 2.8%
0 | 25 | 2.0%
4 | 24 | 1.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1270 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1270 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1270 | 100.0%

Month of absence
Categorical

Distinct: 12
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count | Frequency (%)
3 | 79 | 11.9%
7 | 63 | 9.5%
2 | 62 | 9.4%
10 | 60 | 9.0%
11 | 56 | 8.4%
5 | 55 | 8.3%
8 | 54 | 8.1%
6 | 51 | 7.7%
4 | 49 | 7.4%
12 | 46 | 6.9%
1 | 45 | 6.8%
9 | 43 | 6.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 2
Median length: 1
Mean length: 1.244343891
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 10
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
1 | 263 | 31.9%
2 | 108 | 13.1%
3 | 79 | 9.6%
7 | 63 | 7.6%
0 | 60 | 7.3%
5 | 55 | 6.7%
8 | 54 | 6.5%
6 | 51 | 6.2%
4 | 49 | 5.9%
9 | 43 | 5.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 825 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 825 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 825 | 100.0%

Day of the week
Categorical

Distinct: 5
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 991.0 B

Value | Count | Frequency (%)
2 | 152 | 22.9%
3 | 137 | 20.7%
4 | 133 | 20.1%
6 | 129 | 19.5%
5 | 112 | 16.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 5
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 152 | 22.9%
3 | 137 | 20.7%
4 | 133 | 20.1%
6 | 129 | 19.5%
5 | 112 | 16.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 663 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 663 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 663 | 100.0%

Seasons
Categorical

Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 983.0 B

Value | Count | Frequency (%)
2 | 171 | 25.8%
4 | 168 | 25.3%
3 | 163 | 24.6%
1 | 161 | 24.3%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
2 | 171 | 25.8%
4 | 168 | 25.3%
3 | 163 | 24.6%
1 | 161 | 24.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 663 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 663 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 663 | 100.0%

Transportation expense
Real number (ℝ≥0)

Distinct: 23
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 221.4449472
Minimum: 118
Maximum: 388
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 118
5-th percentile: 118
Q1: 179
Median: 225
Q3: 260
95-th percentile: 361
Maximum: 388
Range: 270
Interquartile range (IQR): 81

Descriptive statistics

Standard deviation: 66.22377769
Coefficient of variation (CV): 0.2990530086
Kurtosis: -0.3329517395
Mean: 221.4449472
Median Absolute Deviation (MAD): 46
Skewness: 0.372029783
Sum: 146818
Variance: 4385.588732
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
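Common values

Value | Count | Frequency (%)
179 | 157 | 23.7%
118 | 80 | 12.1%
225 | 77 | 11.6%
235 | 49 | 7.4%
289 | 43 | 6.5%
260 | 39 | 5.9%
291 | 36 | 5.4%
155 | 29 | 4.4%
246 | 27 | 4.1%
248 | 23 | 3.5%
361 | 23 | 3.5%
330 | 14 | 2.1%
369 | 13 | 2.0%
228 | 8 | 1.2%
189 | 8 | 1.2%
233 | 7 | 1.1%
184 | 6 | 0.9%
157 | 6 | 0.9%
378 | 5 | 0.8%
300 | 5 | 0.8%
279 | 4 | 0.6%
268 | 2 | 0.3%
388 | 2 | 0.3%

Smallest values

Value | Count | Frequency (%)
118 | 80 | 12.1%
155 | 29 | 4.4%
157 | 6 | 0.9%
179 | 157 | 23.7%
184 | 6 | 0.9%
189 | 8 | 1.2%
225 | 77 | 11.6%
228 | 8 | 1.2%
233 | 7 | 1.1%
235 | 49 | 7.4%

Largest values

Value | Count | Frequency (%)
388 | 2 | 0.3%
378 | 5 | 0.8%
369 | 13 | 2.0%
361 | 23 | 3.5%
330 | 14 | 2.1%
300 | 5 | 0.8%
291 | 36 | 5.4%
289 | 43 | 6.5%
279 | 4 | 0.6%
268 | 2 | 0.3%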

Distance from Residence to Work
Real number (ℝ≥0)

Distinct: 23
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.47963801
Minimum: 5
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 5
5-th percentile: 10
Q1: 16
Median: 26
Q3: 50
95-th percentile: 51
Maximum: 52
Range: 47
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 14.73161075
Coefficient of variation (CV): 0.4997215617
Kurtosis: -1.229046189
Mean: 29.47963801
Median Absolute Deviation (MAD): 11
Skewness: 0.3355102368
Sum: 19545
Variance: 217.0203552
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
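Common values

Value | Count | Frequency (%)
26 | 119 | 17.9%
51 | 103 | 15.5%
25 | 50 | 7.5%
10 | 50 | 7.5%
50 | 41 | 6.2%
36 | 38 | 5.7%
31 | 34 | 5.1%
12 | 29 | 4.4%
13 | 28 | 4.2%
11 | 24 | 3.6%
16 | 24 | 3.6%
52 | 23 | 3.5%
22 | 20 | 3.0%
20 | 13 | 2.0%
17 | 13 | 2.0%
29 | 12 | 1.8%
15 | 8 | 1.2%
14 | 8 | 1.2%
27 | 6 | 0.9%
42 | 6 | 0.9%
48 | 5 | 0.8%
49 | 5 | 0.8%
5 | 4 | 0.6%

Smallest values

Value | Count | Frequency (%)
5 | 4 | 0.6%
10 | 50 | 7.5%
11 | 24 | 3.6%
12 | 29 | 4.4%
13 | 28 | 4.2%
14 | 8 | 1.2%
15 | 8 | 1.2%
16 | 24 | 3.6%
17 | 13 | 2.0%
20 | 13 | 2.0%

Largest values

Value | Count | Frequency (%)
52 | 23 | 3.5%
51 | 103 | 15.5%
50 | 41 | 6.2%
49 | 5 | 0.8%
48 | 5 | 0.8%
42 | 6 | 0.9%
36 | 38 | 5.7%
31 | 34 | 5.1%
29 | 12 | 1.8%
27 | 6 | 0.9%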

Service time
Real number (ℝ≥0)

Distinct: 18
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.48717949
Minimum: 1
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 9
Median: 13
Q3: 16
95-th percentile: 18
Maximum: 29
Range: 28
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.425216301
Coefficient of variation (CV): 0.3543807716
Kurtosis: 0.789208855
Mean: 12.48717949
Median Absolute Deviation (MAD): 4
Skewness: 0.029294716
Sum: 8279
Variance: 19.58253931
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
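Common values

Value | Count | Frequency (%)
18 | 124 | 18.7%
9 | 116 | 17.5%
14 | 78 | 11.8%
13 | 64 | 9.7%
12 | 53 | 8.0%
10 | 50 | 7.5%
11 | 44 | 6.6%
16 | 35 | 5.3%
3 | 23 | 3.5%
17 | 20 | 3.0%
4 | 14 | 2.1%
8 | 12 | 1.8%
1 | 7 | 1.1%
7 | 6 | 0.9%
6 | 6 | 0.9%
29 | 5 | 0.8%
15 | 4 | 0.6%
24 | 2 | 0.3%

Smallest values

Value | Count | Frequency (%)
1 | 7 | 1.1%
3 | 23 | 3.5%
4 | 14 | 2.1%
6 | 6 | 0.9%
7 | 6 | 0.9%
8 | 12 | 1.8%
9 | 116 | 17.5%
10 | 50 | 7.5%
11 | 44 | 6.6%
12 | 53 | 8.0%

Largest values

Value | Count | Frequency (%)
29 | 5 | 0.8%
24 | 2 | 0.3%
18 | 124 | 18.7%
17 | 20 | 3.0%
16 | 35 | 5.3%
15 | 4 | 0.6%
14 | 78 | 11.8%
13 | 64 | 9.7%
12 | 53 | 8.0%
11 | 44 | 6.6%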

Age
Real number (ℝ≥0)

Distinct: 21
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.2760181
Minimum: 27
Maximum: 58
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 27
5-th percentile: 28
Q1: 31
Median: 37
Q3: 40
95-th percentile: 50
Maximum: 58
Range: 31
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 6.463292667
Coefficient of variation (CV): 0.178169849
Kurtosis: 0.5721696934
Mean: 36.2760181
Median Absolute Deviation (MAD): 4
Skewness: 0.7471926819
Sum: 24051
Variance: 41.7741521
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
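Common values

Value | Count | Frequency (%)
28 | 110 | 16.6%
38 | 96 | 14.5%
37 | 72 | 10.9%
40 | 54 | 8.1%
33 | 48 | 7.2%
36 | 44 | 6.6%
30 | 41 | 6.2%
41 | 31 | 4.7%
50 | 30 | 4.5%
34 | 29 | 4.4%
47 | 23 | 3.5%
31 | 20 | 3.0%
43 | 18 | 2.7%
32 | 12 | 1.8%
58 | 8 | 1.2%
29 | 6 | 0.9%
27 | 6 | 0.9%
49 | 5 | 0.8%
39 | 4 | 0.6%
48 | 4 | 0.6%
46 | 2 | 0.3%

Smallest values

Value | Count | Frequency (%)
27 | 6 | 0.9%
28 | 110 | 16.6%
29 | 6 | 0.9%
30 | 41 | 6.2%
31 | 20 | 3.0%
32 | 12 | 1.8%
33 | 48 | 7.2%
34 | 29 | 4.4%
36 | 44 | 6.6%
37 | 72 | 10.9%

Largest values

Value | Count | Frequency (%)
58 | 8 | 1.2%
50 | 30 | 4.5%
49 | 5 | 0.8%
48 | 4 | 0.6%
47 | 23 | 3.5%
46 | 2 | 0.3%
43 | 18 | 2.7%
41 | 31 | 4.7%
40 | 54 | 8.1%
39 | 4 | 0.6%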

Work load Average/day
Real number (ℝ≥0)

Distinct: 37
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 271.844175
Minimum: 205.917
Maximum: 378.884
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 205.917
5-th percentile: 222.196
Q1: 244.387
Median: 264.249
Q3: 294.217
95-th percentile: 343.253
Maximum: 378.884
Range: 172.967
Interquartile range (IQR): 49.83

Descriptive statistics

Standard deviation: 39.52583048
Coefficient of variation (CV): 0.1453988502
Kurtosis: 0.448003128
Mean: 271.844175
Median Absolute Deviation (MAD): 22.773
Skewness: 0.9029618989
Sum: 180232.688
Variance: 1562.291275
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=37)
Common values

Value | Count | Frequency (%)
222.196 | 32 | 4.8%
343.253 | 29 | 4.4%
237.656 | 28 | 4.2%
264.249 | 28 | 4.2%
284.853 | 24 | 3.6%
205.917 | 21 | 3.2%
265.017 | 20 | 3.0%
268.519 | 20 | 3.0%
326.452 | 19 | 2.9%
284.031 | 19 | 2.9%
308.593 | 19 | 2.9%
230.29 | 19 | 2.9%
265.615 | 18 | 2.7%
236.629 | 18 | 2.7%
253.957 | 18 | 2.7%
302.585 | 18 | 2.7%
244.387 | 18 | 2.7%
275.089 | 17 | 2.6%
306.345 | 17 | 2.6%
241.476 | 17 | 2.6%
239.554 | 17 | 2.6%
246.288 | 17 | 2.6%
377.55 | 16 | 2.4%
253.465 | 16 | 2.4%
251.818 | 16 | 2.4%
Other values (12) | 162 | 24.4%

Smallest values

Value | Count | Frequency (%)
205.917 | 21 | 3.2%
222.196 | 32 | 4.8%
230.29 | 19 | 2.9%
236.629 | 18 | 2.7%
237.656 | 28 | 4.2%
239.409 | 13 | 2.0%
239.554 | 17 | 2.6%
241.476 | 17 | 2.6%
244.387 | 18 | 2.7%
246.074 | 15 | 2.3%

Largest values

Value | Count | Frequency (%)
378.884 | 12 | 1.8%
377.55 | 16 | 2.4%
343.253 | 29 | 4.4%
330.061 | 11 | 1.7%
326.452 | 19 | 2.9%
313.532 | 15 | 2.3%
308.593 | 19 | 2.9%
306.345 | 17 | 2.6%
302.585 | 18 | 2.7%
294.217 | 15 | 2.3%

Hit target
Real number (ℝ≥0)

Distinct: 13
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 94.68476621
Minimum: 81
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 81
5-th percentile: 88
Q1: 93
Median: 95
Q3: 97
95-th percentile: 99
Maximum: 100
Range: 19
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.675491652
Coefficient of variation (CV): 0.03881819429
Kurtosis: 2.557680858
Mean: 94.68476621
Median Absolute Deviation (MAD): 2
Skewness: -1.246238273
Sum: 62776
Variance: 13.50923888
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
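Common values

Value | Count | Frequency (%)
93 | 98 | 14.8%
99 | 93 | 14.0%
97 | 78 | 11.8%
92 | 69 | 10.4%
96 | 67 | 10.1%
95 | 66 | 10.0%
98 | 60 | 9.0%
91 | 41 | 6.2%
94 | 34 | 5.1%
88 | 20 | 3.0%
81 | 15 | 2.3%
100 | 11 | 1.7%
87 | 11 | 1.7%

Smallest values

Value | Count | Frequency (%)
81 | 15 | 2.3%
87 | 11 | 1.7%
88 | 20 | 3.0%
91 | 41 | 6.2%
92 | 69 | 10.4%
93 | 98 | 14.8%
94 | 34 | 5.1%
95 | 66 | 10.0%
96 | 67 | 10.1%
97 | 78 | 11.8%

Largest values

Value | Count | Frequency (%)
100 | 11 | 1.7%
99 | 93 | 14.0%
98 | 60 | 9.0%
97 | 78 | 11.8%
96 | 67 | 10.1%
95 | 66 | 10.0%
94 | 34 | 5.1%
93 | 98 | 14.8%
92 | 69 | 10.4%
91 | 41 | 6.2%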

Disciplinary failure
Boolean

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 791.0 B

Value | Count | Frequency (%)
False | 663 | 100.0%

Education
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 983.0 B

Value | Count | Frequency (%)
1 | 543 | 81.9%
3 | 73 | 11.0%
2 | 43 | 6.5%
4 | 4 | 0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
1 | 543 | 81.9%
3 | 73 | 11.0%
2 | 43 | 6.5%
4 | 4 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 663 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 663 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 663 | 100.0%

Children
Real number (ℝ≥0)

ZEROS

Distinct: 5
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.042232278
Minimum: 0
Maximum: 4
Zeros: 261
Zeros (%): 39.4%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 4
Maximum: 4
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.105340986
Coefficient of variation (CV): 1.060551481
Kurtosis: 0.6693052233
Mean: 1.042232278
Median Absolute Deviation (MAD): 1
Skewness: 1.054024465
Sum: 691
Variance: 1.221778695
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Common values

Value | Count | Frequency (%)
0 | 261 | 39.4%
1 | 204 | 30.8%
2 | 146 | 22.0%
4 | 39 | 5.9%
3 | 13 | 2.0%

Smallest values

Value | Count | Frequency (%)
0 | 261 | 39.4%
1 | 204 | 30.8%
2 | 146 | 22.0%
3 | 13 | 2.0%
4 | 39 | 5.9%

Largest values

Value | Count | Frequency (%)
4 | 39 | 5.9%
3 | 13 | 2.0%
2 | 146 | 22.0%
1 | 204 | 30.8%
0 | 261 | 39.4%

Social drinker
Boolean

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 791.0 B

Value | Count | Frequency (%)
True | 370 | 55.8%
False | 293 | 44.2%

Social smoker
Boolean

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 791.0 B

Value | Count | Frequency (%)
False | 617 | 93.1%
True | 46 | 6.9%

Pet
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7511312217
Minimum: 0
Maximum: 8
Zeros: 404
Zeros (%): 60.9%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 4
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.296017984
Coefficient of variation (CV): 1.725421533
Kurtosis: 10.0069227
Mean: 0.7511312217
Median Absolute Deviation (MAD): 0
Skewness: 2.741895457
Sum: 498
Variance: 1.679662616
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
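Common values

Value | Count | Frequency (%)
0 | 404 | 60.9%
1 | 130 | 19.6%
2 | 90 | 13.6%
4 | 28 | 4.2%
8 | 7 | 1.1%
5 | 4 | 0.6%

Smallest values

Value | Count | Frequency (%)
0 | 404 | 60.9%
1 | 130 | 19.6%
2 | 90 | 13.6%
4 | 28 | 4.2%
5 | 4 | 0.6%
8 | 7 | 1.1%

Largest values

Value | Count | Frequency (%)
8 | 7 | 1.1%
5 | 4 | 0.6%
4 | 28 | 4.2%
2 | 90 | 13.6%
1 | 130 | 19.6%
0 | 404 | 60.9%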

Weight
Real number (ℝ≥0)

Distinct: 25
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.71191554
Minimum: 56
Maximum: 108
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 56
5-th percentile: 56
Q1: 69
Median: 80
Q3: 89
95-th percentile: 98
Maximum: 108
Range: 52
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 12.69107591
Coefficient of variation (CV): 0.1612344944
Kurtosis: -0.9199774004
Mean: 78.71191554
Median Absolute Deviation (MAD): 10
Skewness: 0.02064129444
Sum: 52186
Variance: 161.0634077
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
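Common values

Value | Count | Frequency (%)
89 | 96 | 14.5%
69 | 81 | 12.2%
65 | 54 | 8.1%
83 | 50 | 7.5%
56 | 41 | 6.2%
90 | 38 | 5.7%
73 | 34 | 5.1%
95 | 29 | 4.4%
98 | 28 | 4.2%
67 | 27 | 4.1%
88 | 26 | 3.9%
86 | 23 | 3.5%
80 | 23 | 3.5%
63 | 20 | 3.0%
75 | 18 | 2.7%
84 | 14 | 2.1%
106 | 13 | 2.0%
70 | 13 | 2.0%
68 | 11 | 1.7%
58 | 6 | 0.9%
108 | 5 | 0.8%
77 | 5 | 0.8%
94 | 4 | 0.6%
76 | 2 | 0.3%
79 | 2 | 0.3%

Smallest values

Value | Count | Frequency (%)
56 | 41 | 6.2%
58 | 6 | 0.9%
63 | 20 | 3.0%
65 | 54 | 8.1%
67 | 27 | 4.1%
68 | 11 | 1.7%
69 | 81 | 12.2%
70 | 13 | 2.0%
73 | 34 | 5.1%
75 | 18 | 2.7%

Largest values

Value | Count | Frequency (%)
108 | 5 | 0.8%
106 | 13 | 2.0%
98 | 28 | 4.2%
95 | 29 | 4.4%
94 | 4 | 0.6%
90 | 38 | 5.7%
89 | 96 | 14.5%
88 | 26 | 3.9%
86 | 23 | 3.5%
84 | 14 | 2.1%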

Height
Real number (ℝ≥0)

Distinct: 14
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 172.2262443
Minimum: 163
Maximum: 196
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 163
5-th percentile: 167
Q1: 169
Median: 171
Q3: 172
95-th percentile: 185
Maximum: 196
Range: 33
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 6.216640682
Coefficient of variation (CV): 0.03609578033
Kurtosis: 6.969340933
Mean: 172.2262443
Median Absolute Deviation (MAD): 1
Skewness: 2.554069834
Sum: 114186
Variance: 38.64662137
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
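Common values

Value | Count | Frequency (%)
172 | 146 | 22.0%
170 | 143 | 21.6%
169 | 88 | 13.3%
171 | 75 | 11.3%
178 | 49 | 7.4%
168 | 43 | 6.5%
196 | 29 | 4.4%
167 | 27 | 4.1%
165 | 23 | 3.5%
182 | 18 | 2.7%
175 | 7 | 1.1%
185 | 6 | 0.9%
174 | 5 | 0.8%
163 | 4 | 0.6%

Smallest values

Value | Count | Frequency (%)
163 | 4 | 0.6%
165 | 23 | 3.5%
167 | 27 | 4.1%
168 | 43 | 6.5%
169 | 88 | 13.3%
170 | 143 | 21.6%
171 | 75 | 11.3%
172 | 146 | 22.0%
174 | 5 | 0.8%
175 | 7 | 1.1%

Largest values

Value | Count | Frequency (%)
196 | 29 | 4.4%
185 | 6 | 0.9%
182 | 18 | 2.7%
178 | 49 | 7.4%
175 | 7 | 1.1%
174 | 5 | 0.8%
172 | 146 | 22.0%
171 | 75 | 11.3%
170 | 143 | 21.6%
169 | 88 | 13.3%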

Body mass index
Real number (ℝ≥0)

Distinct: 15
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.52639517
Minimum: 19
Maximum: 38
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 19
5-th percentile: 19
Q1: 24
Median: 25
Q3: 31
95-th percentile: 32
Maximum: 38
Range: 19
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.150726665
Coefficient of variation (CV): 0.1564753385
Kurtosis: -0.3097806716
Mean: 26.52639517
Median Absolute Deviation (MAD): 3
Skewness: 0.2882197173
Sum: 17587
Variance: 17.22853185
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
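Common values

Value | Count | Frequency (%)
31 | 124 | 18.7%
25 | 117 | 17.6%
24 | 79 | 11.9%
23 | 68 | 10.3%
28 | 54 | 8.1%
19 | 41 | 6.2%
30 | 38 | 5.7%
22 | 34 | 5.1%
32 | 23 | 3.5%
27 | 23 | 3.5%
29 | 22 | 3.3%
21 | 18 | 2.7%
38 | 13 | 2.0%
36 | 5 | 0.8%
33 | 4 | 0.6%

Smallest values

Value | Count | Frequency (%)
19 | 41 | 6.2%
21 | 18 | 2.7%
22 | 34 | 5.1%
23 | 68 | 10.3%
24 | 79 | 11.9%
25 | 117 | 17.6%
27 | 23 | 3.5%
28 | 54 | 8.1%
29 | 22 | 3.3%
30 | 38 | 5.7%

Largest values

Value | Count | Frequency (%)
38 | 13 | 2.0%
36 | 5 | 0.8%
33 | 4 | 0.6%
32 | 23 | 3.5%
31 | 124 | 18.7%
30 | 38 | 5.7%
29 | 22 | 3.3%
28 | 54 | 8.1%
27 | 23 | 3.5%
25 | 117 | 17.6%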

Absenteeism time in hours
Real number (ℝ≥0)

Distinct: 19
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.606334842
Minimum: 0
Maximum: 120
Zeros: 1
Zeros (%): 0.2%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 8
95-th percentile: 24
Maximum: 120
Range: 120
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 13.91690127
Coefficient of variation (CV): 1.829646152
Kurtosis: 35.20063653
Mean: 7.606334842
Median Absolute Deviation (MAD): 3
Skewness: 5.477282659
Sum: 5043
Variance: 193.6801411
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
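Common values

Value | Count | Frequency (%)
8 | 207 | 31.2%
2 | 137 | 20.7%
3 | 102 | 15.4%
1 | 85 | 12.8%
4 | 60 | 9.0%
16 | 19 | 2.9%
24 | 16 | 2.4%
5 | 7 | 1.1%
40 | 7 | 1.1%
32 | 6 | 0.9%
120 | 3 | 0.5%
64 | 3 | 0.5%
80 | 3 | 0.5%
112 | 2 | 0.3%
56 | 2 | 0.3%
7 | 1 | 0.2%
48 | 1 | 0.2%
104 | 1 | 0.2%
0 | 1 | 0.2%

Smallest values

Value | Count | Frequency (%)
0 | 1 | 0.2%
1 | 85 | 12.8%
2 | 137 | 20.7%
3 | 102 | 15.4%
4 | 60 | 9.0%
5 | 7 | 1.1%
7 | 1 | 0.2%
8 | 207 | 31.2%
16 | 19 | 2.9%
24 | 16 | 2.4%

Largest values

Value | Count | Frequency (%)
120 | 3 | 0.5%
112 | 2 | 0.3%
104 | 1 | 0.2%
80 | 3 | 0.5%
64 | 3 | 0.5%
56 | 2 | 0.3%
48 | 1 | 0.2%
40 | 7 | 1.1%
32 | 6 | 0.9%
24 | 16 | 2.4%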

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the slope of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
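
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the report software's implementation; the function name `pearson_r` and the sample data are made up for the example. The factor 1/n in the covariance and in the standard deviations cancels, so the sketch works with the raw sums.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(xs, [2 * x + 5 for x in xs]))    # increasing straight line
print(pearson_r(xs, [-3 * x + 1 for x in xs]))   # decreasing straight line
```

Because the denominator vanishes for a constant variable, r cannot be computed for a column such as Disciplinary failure, which holds a single value in this dataset.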

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
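
A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. The helper names and the choice to give tied values their average rank are assumptions of this example, not something the report states.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson's r computed from raw sums of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # monotonic but not linear
print(pearson_r(xs, ys))    # below 1 because the relation is curved
print(spearman_rho(xs, ys)) # the ranks agree perfectly
```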

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
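
The pair-counting definition can be written out directly. The sketch below implements exactly the formula in the preceding paragraph (often called tau-a); statistical libraries commonly use a tie-adjusted variant, so results may differ when the data contain ties. The function name and sample data are illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# One of the six pairs is out of order: (5 - 1) / 6.
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```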

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
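
As an illustration, the bias-corrected estimator can be sketched as below. This follows the commonly cited form of Bergsma's correction and is not necessarily identical to what the report software computes (for example, it applies no continuity correction). The function name and the toy data are assumptions of the example.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length categorical samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with one category carries no association
    return math.sqrt(phi2_corr / denom)


same = ["a", "b"] * 50
print(cramers_v_corrected(same, same))  # perfect association
print(cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25))  # independent
```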

Missing values

Sample

First rows

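(index) | df_index | ID | Reason for absence | Month of absence | Day of the week | Seasons | Transportation expense | Distance from Residence to Work | Service time | Age | Work load Average/day | Hit target | Disciplinary failure | Education | Children | Social drinker | Social smoker | Pet | Weight | Height | Body mass index | Absenteeism time in hours
0 | 0 | 11 | 26 | 7 | 3 | 1 | 289 | 36 | 13 | 33 | 239.554 | 97 | False | 1 | 2 | True | False | 1 | 90 | 172 | 30 | 4
1 | 2 | 3 | 23 | 7 | 4 | 1 | 179 | 51 | 18 | 38 | 239.554 | 97 | False | 1 | 0 | True | False | 0 | 89 | 170 | 31 | 2
2 | 3 | 7 | 7 | 7 | 5 | 1 | 279 | 5 | 14 | 39 | 239.554 | 97 | False | 1 | 2 | True | True | 0 | 68 | 168 | 24 | 4
3 | 4 | 11 | 23 | 7 | 5 | 1 | 289 | 36 | 13 | 33 | 239.554 | 97 | False | 1 | 2 | True | False | 1 | 90 | 172 | 30 | 2
4 | 5 | 3 | 23 | 7 | 6 | 1 | 179 | 51 | 18 | 38 | 239.554 | 97 | False | 1 | 0 | True | False | 0 | 89 | 170 | 31 | 2
5 | 6 | 10 | 22 | 7 | 6 | 1 | 361 | 52 | 3 | 28 | 239.554 | 97 | False | 1 | 1 | True | False | 4 | 80 | 172 | 27 | 8
6 | 7 | 20 | 23 | 7 | 6 | 1 | 260 | 50 | 11 | 36 | 239.554 | 97 | False | 1 | 4 | True | False | 0 | 65 | 168 | 23 | 4
7 | 8 | 14 | 19 | 7 | 2 | 1 | 155 | 12 | 14 | 34 | 239.554 | 97 | False | 1 | 2 | True | False | 0 | 95 | 196 | 25 | 40
8 | 9 | 1 | 22 | 7 | 2 | 1 | 235 | 11 | 14 | 37 | 239.554 | 97 | False | 3 | 1 | False | False | 1 | 88 | 172 | 29 | 8
9 | 10 | 20 | 1 | 7 | 2 | 1 | 260 | 50 | 11 | 36 | 239.554 | 97 | False | 1 | 4 | True | False | 0 | 65 | 168 | 23 | 8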

Last rows

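(index) | df_index | ID | Reason for absence | Month of absence | Day of the week | Seasons | Transportation expense | Distance from Residence to Work | Service time | Age | Work load Average/day | Hit target | Disciplinary failure | Education | Children | Social drinker | Social smoker | Pet | Weight | Height | Body mass index | Absenteeism time in hours
653 | 727 | 9 | 6 | 7 | 2 | 1 | 228 | 14 | 16 | 58 | 264.604 | 93 | False | 1 | 2 | False | False | 1 | 65 | 172 | 22 | 8
654 | 728 | 34 | 28 | 7 | 2 | 1 | 118 | 10 | 10 | 37 | 264.604 | 93 | False | 1 | 0 | False | False | 0 | 83 | 172 | 28 | 4
655 | 729 | 9 | 6 | 7 | 3 | 1 | 228 | 14 | 16 | 58 | 264.604 | 93 | False | 1 | 2 | False | False | 1 | 65 | 172 | 22 | 120
656 | 730 | 6 | 22 | 7 | 3 | 1 | 189 | 29 | 13 | 33 | 264.604 | 93 | False | 1 | 2 | False | False | 2 | 69 | 167 | 25 | 16
657 | 731 | 34 | 23 | 7 | 4 | 1 | 118 | 10 | 10 | 37 | 264.604 | 93 | False | 1 | 0 | False | False | 0 | 83 | 172 | 28 | 2
658 | 732 | 10 | 22 | 7 | 4 | 1 | 361 | 52 | 3 | 28 | 264.604 | 93 | False | 1 | 1 | True | False | 4 | 80 | 172 | 27 | 8
659 | 733 | 28 | 22 | 7 | 4 | 1 | 225 | 26 | 9 | 28 | 264.604 | 93 | False | 1 | 1 | False | False | 2 | 69 | 169 | 24 | 8
660 | 734 | 13 | 13 | 7 | 2 | 1 | 369 | 17 | 12 | 31 | 264.604 | 93 | False | 1 | 3 | True | False | 0 | 70 | 169 | 25 | 80
661 | 735 | 11 | 14 | 7 | 3 | 1 | 289 | 36 | 13 | 33 | 264.604 | 93 | False | 1 | 2 | True | False | 1 | 90 | 172 | 30 | 8
662 | 736 | 1 | 11 | 7 | 3 | 1 | 235 | 11 | 14 | 37 | 264.604 | 93 | False | 3 | 1 | False | False | 1 | 88 | 172 | 29 | 4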